Overview

Dataset statistics

Number of variables: 15
Number of observations: 6362620
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 728.1 MiB
Average record size in memory: 120.0 B
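
The memory figures above are consistent with every column being stored as 8 bytes per value. The report does not state this, so treat it as an assumption about the underlying dtypes. A minimal check under that assumption:

```python
# Assumption (not stated in the report): each of the 15 columns takes
# 8 bytes per row (64-bit numbers, or 8-byte object pointers for strings).
N_ROWS = 6_362_620
N_COLS = 15
BYTES_PER_VALUE = 8
MIB = 1024 ** 2

record_bytes = N_COLS * BYTES_PER_VALUE      # bytes per observation
column_mib = N_ROWS * BYTES_PER_VALUE / MIB  # memory of a single column
total_mib = N_ROWS * record_bytes / MIB      # memory of the whole table

print(f"record size: {record_bytes} B")
print(f"per column:  {column_mib:.1f} MiB")
print(f"total:       {total_mib:.1f} MiB")
```

Under this assumption the numbers match the reported 120.0 B record size, the 48.5 MiB memory size listed for each variable, and the 728.1 MiB total.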

Variable types

Numeric: 9
Categorical: 6

Alerts

name_orig has a high cardinality: 6353307 distinct values [High cardinality]
name_dest has a high cardinality: 2722362 distinct values [High cardinality]
step is highly overall correlated with day [High correlation]
amount is highly overall correlated with oldbalance_dest and 3 other fields [High correlation]
oldbalance_org is highly overall correlated with newbalance_orig [High correlation]
newbalance_orig is highly overall correlated with oldbalance_org and 1 other field [High correlation]
oldbalance_dest is highly overall correlated with amount and 2 other fields [High correlation]
newbalance_dest is highly overall correlated with amount and 2 other fields [High correlation]
diff_org_balance is highly overall correlated with amount and 2 other fields [High correlation]
diff_dest_balance is highly overall correlated with amount and 1 other field [High correlation]
day is highly overall correlated with step [High correlation]
type is highly overall correlated with merchant_dest [High correlation]
merchant_dest is highly overall correlated with type [High correlation]
is_fraud is highly imbalanced (98.6%) [Imbalance]
is_flagged_fraud is highly imbalanced (> 99.9%) [Imbalance]
amount is highly skewed (γ1 = 30.99394948) [Skewed]
diff_org_balance is highly skewed (γ1 = -30.07475092) [Skewed]
diff_dest_balance is highly skewed (γ1 = -30.33109298) [Skewed]
name_orig is uniformly distributed [Uniform]
oldbalance_org has 2102449 (33.0%) zeros [Zeros]
newbalance_orig has 3609566 (56.7%) zeros [Zeros]
oldbalance_dest has 2704388 (42.5%) zeros [Zeros]
newbalance_dest has 2439433 (38.3%) zeros [Zeros]
diff_org_balance has 1361836 (21.4%) zeros [Zeros]
diff_dest_balance has 1032168 (16.2%) zeros [Zeros]
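
The report does not define the imbalance percentage. One definition that reproduces both values is one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical, and the counts are the ones listed for is_fraud and is_flagged_fraud in the Variables section.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    Assumed definition; 0 means perfectly balanced, 1 means one class only."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(len(counts))

is_fraud_score = imbalance_score([6_354_407, 8_213])
is_flagged_score = imbalance_score([6_362_604, 16])

print(f"is_fraud:         {is_fraud_score:.1%}")
print(f"is_flagged_fraud: {is_flagged_score:.3%}")
```

Under this assumption is_fraud scores about 98.6% and is_flagged_fraud scores above 99.9%, in line with the two alerts.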

Reproduction

Analysis started: 2023-02-05 11:53:20.689776
Analysis finished: 2023-02-05 12:04:23.653997
Duration: 11 minutes and 2.96 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

step
Real number (ℝ)

Distinct: 743
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 243.39725
Minimum: 1
Maximum: 743
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 16
Q1: 156
Median: 239
Q3: 335
95-th percentile: 490
Maximum: 743
Range: 742
Interquartile range (IQR): 179

Descriptive statistics

Standard deviation: 142.33197
Coefficient of variation (CV): 0.58477232
Kurtosis: 0.32907056
Mean: 243.39725
Median Absolute Deviation (MAD): 92
Skewness: 0.37517689
Sum: 1.5486442 × 10^9
Variance: 20258.39
Monotonicity: Increasing
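
Several of the step statistics are derived from others, so they can be cross-checked with a few lines of arithmetic. The sketch below uses only values copied from this section and the observation count from the overview; the variable names are illustrative.

```python
# Values copied from the step statistics in this report.
N_ROWS = 6_362_620
mean, std = 243.39725, 142.33197
q1, q3 = 156, 335
minimum, maximum = 1, 743

value_range = maximum - minimum  # reported: 742
iqr = q3 - q1                    # reported: 179
cv = std / mean                  # reported: 0.58477232
variance = std ** 2              # reported: 20258.39
total = mean * N_ROWS            # reported: 1.5486442 × 10^9

print(value_range, iqr, round(cv, 6), round(variance, 2), f"{total:.7e}")
```

The same relationships (range, IQR, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count) apply to the other numeric variables in the report.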
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
19  51352  0.8%
18  49579  0.8%
187  49083  0.8%
235  47491  0.7%
307  46968  0.7%
163  46352  0.7%
139  46054  0.7%
403  45155  0.7%
43  45060  0.7%
355  44787  0.7%
Other values (733)  5890739  92.6%

Value  Count  Frequency (%)
1  2708  < 0.1%
2  1014  < 0.1%
3  552  < 0.1%
4  565  < 0.1%
5  665  < 0.1%
6  1660  < 0.1%
7  6837  0.1%
8  21097  0.3%
9  37628  0.6%
10  35991  0.6%

Value  Count  Frequency (%)
743  8  < 0.1%
742  14  < 0.1%
741  22  < 0.1%
740  6  < 0.1%
739  10  < 0.1%
738  10  < 0.1%
737  10  < 0.1%
736  14  < 0.1%
735  12  < 0.1%
734  8  < 0.1%

type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

CASH_OUT  2237500
PAYMENT  2151495
CASH_IN  1399284
TRANSFER  532909
DEBIT  41432

Length

Max length: 8
Median length: 7
Mean length: 7.422396
Min length: 5

Characters and Unicode

Total characters: 47225885
Distinct characters: 18
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PAYMENT
2nd row: PAYMENT
3rd row: TRANSFER
4th row: CASH_OUT
5th row: PAYMENT

Common Values

Value  Count  Frequency (%)
CASH_OUT  2237500  35.2%
PAYMENT  2151495  33.8%
CASH_IN  1399284  22.0%
TRANSFER  532909  8.4%
DEBIT  41432  0.7%
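
The length and character totals reported for type follow directly from the category counts. A short check, using only the counts listed in this section (variable names are illustrative):

```python
# Category counts copied from the Common Values table for `type`.
counts = {
    "CASH_OUT": 2_237_500,
    "PAYMENT": 2_151_495,
    "CASH_IN": 1_399_284,
    "TRANSFER": 532_909,
    "DEBIT": 41_432,
}

n_rows = sum(counts.values())
# Each label contributes len(label) characters per occurrence.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows
# Underscores only occur in CASH_OUT and CASH_IN.
underscores = sum(label.count("_") * n for label, n in counts.items())

print(n_rows, total_chars, round(mean_length, 6), underscores)
```

This reproduces the 6362620 observations, the 47225885 total characters, the mean length of 7.422396, and the 3636784 occurrences of `_` listed in the character tables.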

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cash_out 2237500
35.2%
payment 2151495
33.8%
cash_in 1399284
22.0%
transfer 532909
 
8.4%
debit 41432
 
0.7%

Most occurring characters

ValueCountFrequency (%)
A 6321188
13.4%
T 4963336
10.5%
S 4169693
8.8%
N 4083688
8.6%
C 3636784
 
7.7%
H 3636784
 
7.7%
_ 3636784
 
7.7%
E 2725836
 
5.8%
O 2237500
 
4.7%
U 2237500
 
4.7%
Other values (8) 9576792
20.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 43589101
92.3%
Connector Punctuation 3636784
 
7.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 6321188
14.5%
T 4963336
11.4%
S 4169693
9.6%
N 4083688
9.4%
C 3636784
8.3%
H 3636784
8.3%
E 2725836
 
6.3%
O 2237500
 
5.1%
U 2237500
 
5.1%
Y 2151495
 
4.9%
Other values (7) 7425297
17.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3636784
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 43589101
92.3%
Common 3636784
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 6321188
14.5%
T 4963336
11.4%
S 4169693
9.6%
N 4083688
9.4%
C 3636784
8.3%
H 3636784
8.3%
E 2725836
 
6.3%
O 2237500
 
5.1%
U 2237500
 
5.1%
Y 2151495
 
4.9%
Other values (7) 7425297
17.0%
Common
ValueCountFrequency (%)
_ 3636784
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47225885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 6321188
13.4%
T 4963336
10.5%
S 4169693
8.8%
N 4083688
8.6%
C 3636784
 
7.7%
H 3636784
 
7.7%
_ 3636784
 
7.7%
E 2725836
 
5.8%
O 2237500
 
4.7%
U 2237500
 
4.7%
Other values (8) 9576792
20.3%

amount
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 5316900
Distinct (%): 83.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 179861.9
Minimum: 0
Maximum: 92445517
Zeros: 16
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2224.0995
Q1: 13389.57
Median: 74871.94
Q3: 208721.48
95-th percentile: 518634.2
Maximum: 92445517
Range: 92445517
Interquartile range (IQR): 195331.91

Descriptive statistics

Standard deviation: 603858.23
Coefficient of variation (CV): 3.3573437
Kurtosis: 1797.9567
Mean: 179861.9
Median Absolute Deviation (MAD): 68393.655
Skewness: 30.993949
Sum: 1.1443929 × 10^12
Variance: 3.6464476 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
10000000  3207  0.1%
10000  88  < 0.1%
5000  79  < 0.1%
15000  68  < 0.1%
500  65  < 0.1%
100000  42  < 0.1%
21500  37  < 0.1%
120000  29  < 0.1%
135000  20  < 0.1%
0  16  < 0.1%
Other values (5316890)  6358969  99.9%

Value  Count  Frequency (%)
0  16  < 0.1%
0.01  1  < 0.1%
0.02  3  < 0.1%
0.03  2  < 0.1%
0.04  1  < 0.1%
0.06  1  < 0.1%
0.07  1  < 0.1%
0.09  1  < 0.1%
0.1  1  < 0.1%
0.11  2  < 0.1%

Value  Count  Frequency (%)
92445516.64  1  < 0.1%
73823490.36  1  < 0.1%
71172480.42  1  < 0.1%
69886731.3  1  < 0.1%
69337316.27  1  < 0.1%
67500761.29  1  < 0.1%
66761272.21  1  < 0.1%
64234448.19  1  < 0.1%
63847992.58  1  < 0.1%
63294839.63  1  < 0.1%

name_orig
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 6353307
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

C1902386530  3
C363736674  3
C545315117  3
C724452879  3
C1784010646  3
Other values (6353302)  6362605

Length

Max length: 11
Median length: 11
Mean length: 10.482323
Min length: 5

Characters and Unicode

Total characters: 66695040
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6344009
Unique (%): 99.7%

Sample

1st row: C1231006815
2nd row: C1666544295
3rd row: C1305486145
4th row: C840083671
5th row: C2048537720
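
The distinct and unique counts for name_orig also determine how many IDs repeat. The sketch below assumes that no ID occurs more than three times, which is the highest count shown for this variable; with that assumption the two unknowns (IDs seen twice, IDs seen three times) follow from two linear equations.

```python
# Values copied from the name_orig overview.
n_rows = 6_362_620
n_distinct = 6_353_307
n_unique = 6_344_009  # IDs that occur exactly once

repeated_ids = n_distinct - n_unique  # IDs that occur more than once
repeated_rows = n_rows - n_unique     # rows that belong to those IDs

# Assumption: a repeated ID occurs either 2 or 3 times, so
#   twice + thrice         = repeated_ids
#   2 * twice + 3 * thrice = repeated_rows
thrice = repeated_rows - 2 * repeated_ids
twice = repeated_ids - thrice

print(repeated_ids, repeated_rows, twice, thrice)
```

Under that assumption only a small number of origin IDs appear three times, and the rest of the repeated IDs appear twice, which is why the variable is flagged as both high cardinality and uniform.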

Common Values

Value  Count  Frequency (%)
C1902386530  3  < 0.1%
C363736674  3  < 0.1%
C545315117  3  < 0.1%
C724452879  3  < 0.1%
C1784010646  3  < 0.1%
C1677795071  3  < 0.1%
C1462946854  3  < 0.1%
C1999539787  3  < 0.1%
C2098525306  3  < 0.1%
C400299098  3  < 0.1%
Other values (6353297)  6362590  > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c1902386530 3
 
< 0.1%
c2098525306 3
 
< 0.1%
c363736674 3
 
< 0.1%
c1530544995 3
 
< 0.1%
c1065307291 3
 
< 0.1%
c2051359467 3
 
< 0.1%
c1832548028 3
 
< 0.1%
c400299098 3
 
< 0.1%
c1976208114 3
 
< 0.1%
c1999539787 3
 
< 0.1%
Other values (6353297) 6362590
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 8803448
13.2%
C 6362620
9.5%
2 6136135
9.2%
3 5699596
8.5%
4 5693146
8.5%
7 5669437
8.5%
5 5668010
8.5%
6 5667725
8.5%
0 5667074
8.5%
9 5665212
8.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 60332420
90.5%
Uppercase Letter 6362620
 
9.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8803448
14.6%
2 6136135
10.2%
3 5699596
9.4%
4 5693146
9.4%
7 5669437
9.4%
5 5668010
9.4%
6 5667725
9.4%
0 5667074
9.4%
9 5665212
9.4%
8 5662637
9.4%
Uppercase Letter
ValueCountFrequency (%)
C 6362620
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 60332420
90.5%
Latin 6362620
 
9.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 8803448
14.6%
2 6136135
10.2%
3 5699596
9.4%
4 5693146
9.4%
7 5669437
9.4%
5 5668010
9.4%
6 5667725
9.4%
0 5667074
9.4%
9 5665212
9.4%
8 5662637
9.4%
Latin
ValueCountFrequency (%)
C 6362620
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 66695040
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 8803448
13.2%
C 6362620
9.5%
2 6136135
9.2%
3 5699596
8.5%
4 5693146
8.5%
7 5669437
8.5%
5 5668010
8.5%
6 5667725
8.5%
0 5667074
8.5%
9 5665212
8.5%

oldbalance_org
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1845844
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 833883.1
Minimum: 0
Maximum: 59585040
Zeros: 2102449
Zeros (%): 33.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 14208
Q3: 107315.18
95-th percentile: 5823702.3
Maximum: 59585040
Range: 59585040
Interquartile range (IQR): 107315.18

Descriptive statistics

Standard deviation: 2888242.7
Coefficient of variation (CV): 3.4636062
Kurtosis: 32.964879
Mean: 833883.1
Median Absolute Deviation (MAD): 14208
Skewness: 5.2491364
Sum: 5.3056813 × 10^12
Variance: 8.3419457 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  2102449  33.0%
184  918  < 0.1%
133  914  < 0.1%
195  912  < 0.1%
164  909  < 0.1%
181  908  < 0.1%
109  908  < 0.1%
157  902  < 0.1%
146  899  < 0.1%
128  898  < 0.1%
Other values (1845834)  4252003  66.8%

Value  Count  Frequency (%)
0  2102449  33.0%
0.05  1  < 0.1%
0.18  1  < 0.1%
0.21  1  < 0.1%
0.44  1  < 0.1%
0.67  1  < 0.1%
1  370  < 0.1%
1.02  1  < 0.1%
1.37  1  < 0.1%
1.38  1  < 0.1%

Value  Count  Frequency (%)
59585040.37  1  < 0.1%
57316255.05  1  < 0.1%
50399045.08  1  < 0.1%
49585040.37  1  < 0.1%
47316255.05  1  < 0.1%
45674547.89  1  < 0.1%
44892193.09  1  < 0.1%
43818855.3  1  < 0.1%
43686616.33  1  < 0.1%
42542664.27  1  < 0.1%

newbalance_orig
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2682586
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 855113.67
Minimum: 0
Maximum: 49585040
Zeros: 3609566
Zeros (%): 56.7%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 144258.41
95-th percentile: 5980262.3
Maximum: 49585040
Range: 49585040
Interquartile range (IQR): 144258.41

Descriptive statistics

Standard deviation: 2924048.5
Coefficient of variation (CV): 3.4194852
Kurtosis: 32.066985
Mean: 855113.67
Median Absolute Deviation (MAD): 0
Skewness: 5.176884
Sum: 5.4407633 × 10^12
Variance: 8.5500596 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  3609566  56.7%
5888.64  4  < 0.1%
15073.44  4  < 0.1%
5122  4  < 0.1%
36875.73  4  < 0.1%
10528.49  4  < 0.1%
904.13  4  < 0.1%
18392.51  4  < 0.1%
32926.52  4  < 0.1%
4277.69  4  < 0.1%
Other values (2682576)  2753018  43.3%

Value  Count  Frequency (%)
0  3609566  56.7%
0.01  1  < 0.1%
0.03  1  < 0.1%
0.05  1  < 0.1%
0.12  1  < 0.1%
0.13  1  < 0.1%
0.18  1  < 0.1%
0.21  1  < 0.1%
0.23  1  < 0.1%
0.3  1  < 0.1%

Value  Count  Frequency (%)
49585040.37  1  < 0.1%
47316255.05  1  < 0.1%
43686616.33  1  < 0.1%
43673802.21  1  < 0.1%
41690842.64  1  < 0.1%
41432359.46  1  < 0.1%
40399045.08  1  < 0.1%
39585040.37  1  < 0.1%
38946233.02  1  < 0.1%
38939424.03  1  < 0.1%

name_dest
Categorical

Distinct: 2722362
Distinct (%): 42.8%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

C1286084959  113
C985934102  109
C665576141  105
C2083562754  102
C248609774  101
Other values (2722357)  6362090

Length

Max length: 11
Median length: 11
Mean length: 10.481752
Min length: 2

Characters and Unicode

Total characters: 66691405
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2262704
Unique (%): 35.6%

Sample

1st row: M1979787155
2nd row: M2044282225
3rd row: C553264065
4th row: C38997010
5th row: M1230701703

Common Values

Value  Count  Frequency (%)
C1286084959  113  < 0.1%
C985934102  109  < 0.1%
C665576141  105  < 0.1%
C2083562754  102  < 0.1%
C248609774  101  < 0.1%
C1590550415  101  < 0.1%
C451111351  99  < 0.1%
C1789550256  99  < 0.1%
C1360767589  98  < 0.1%
C1023714065  97  < 0.1%
Other values (2722352)  6361596  > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
c1286084959 113
 
< 0.1%
c985934102 109
 
< 0.1%
c665576141 105
 
< 0.1%
c2083562754 102
 
< 0.1%
c248609774 101
 
< 0.1%
c1590550415 101
 
< 0.1%
c451111351 99
 
< 0.1%
c1789550256 99
 
< 0.1%
c1360767589 98
 
< 0.1%
c1023714065 97
 
< 0.1%
Other values (2722352) 6361596
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 8799996
13.2%
2 6133780
9.2%
3 5704404
8.6%
4 5691070
8.5%
8 5675627
8.5%
9 5668861
8.5%
7 5665128
8.5%
0 5664751
8.5%
6 5662897
8.5%
5 5662271
8.5%
Other values (2) 6362620
9.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 60328785
90.5%
Uppercase Letter 6362620
 
9.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8799996
14.6%
2 6133780
10.2%
3 5704404
9.5%
4 5691070
9.4%
8 5675627
9.4%
9 5668861
9.4%
7 5665128
9.4%
0 5664751
9.4%
6 5662897
9.4%
5 5662271
9.4%
Uppercase Letter
ValueCountFrequency (%)
C 4211125
66.2%
M 2151495
33.8%

Most occurring scripts

ValueCountFrequency (%)
Common 60328785
90.5%
Latin 6362620
 
9.5%

Most frequent character per script

Common
ValueCountFrequency (%)
1 8799996
14.6%
2 6133780
10.2%
3 5704404
9.5%
4 5691070
9.4%
8 5675627
9.4%
9 5668861
9.4%
7 5665128
9.4%
0 5664751
9.4%
6 5662897
9.4%
5 5662271
9.4%
Latin
ValueCountFrequency (%)
C 4211125
66.2%
M 2151495
33.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 66691405
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 8799996
13.2%
2 6133780
9.2%
3 5704404
8.6%
4 5691070
8.5%
8 5675627
8.5%
9 5668861
8.5%
7 5665128
8.5%
0 5664751
8.5%
6 5662897
8.5%
5 5662271
8.5%
Other values (2) 6362620
9.5%

oldbalance_dest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3614697
Distinct (%): 56.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1100701.7
Minimum: 0
Maximum: 3.5601589 × 10^8
Zeros: 2704388
Zeros (%): 42.5%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 132705.66
Q3: 943036.71
95-th percentile: 5147229.7
Maximum: 3.5601589 × 10^8
Range: 3.5601589 × 10^8
Interquartile range (IQR): 943036.71

Descriptive statistics

Standard deviation: 3399180.1
Coefficient of variation (CV): 3.0881938
Kurtosis: 948.67413
Mean: 1100701.7
Median Absolute Deviation (MAD): 132705.66
Skewness: 19.921758
Sum: 7.0033464 × 10^12
Variance: 1.1554425 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  2704388  42.5%
10000000  615  < 0.1%
20000000  219  < 0.1%
30000000  86  < 0.1%
40000000  31  < 0.1%
102  21  < 0.1%
198  19  < 0.1%
125  18  < 0.1%
160  18  < 0.1%
132  18  < 0.1%
Other values (3614687)  3657187  57.5%

Value  Count  Frequency (%)
0  2704388  42.5%
0.01  1  < 0.1%
0.03  1  < 0.1%
0.13  1  < 0.1%
0.33  1  < 0.1%
0.37  1  < 0.1%
0.79  1  < 0.1%
1  7  < 0.1%
1.39  1  < 0.1%
1.64  1  < 0.1%

Value  Count  Frequency (%)
356015889.4  1  < 0.1%
355553416.3  1  < 0.1%
355381433.6  1  < 0.1%
355380483.5  1  < 0.1%
355185537.1  1  < 0.1%
328194464.9  1  < 0.1%
327998074.2  1  < 0.1%
327963024  1  < 0.1%
327852121.4  1  < 0.1%
327827763.4  1  < 0.1%

newbalance_dest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3555499
Distinct (%): 55.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1224996.4
Minimum: 0
Maximum: 3.5617928 × 10^8
Zeros: 2439433
Zeros (%): 38.3%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 214661.44
Q3: 1111909.2
95-th percentile: 5515715.9
Maximum: 3.5617928 × 10^8
Range: 3.5617928 × 10^8
Interquartile range (IQR): 1111909.2

Descriptive statistics

Standard deviation: 3674128.9
Coefficient of variation (CV): 2.9992978
Kurtosis: 862.15651
Mean: 1224996.4
Median Absolute Deviation (MAD): 214661.44
Skewness: 19.352302
Sum: 7.7941866 × 10^12
Variance: 1.3499223 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  2439433  38.3%
10000000  53  < 0.1%
971418.91  32  < 0.1%
19169204.93  29  < 0.1%
1254956.07  25  < 0.1%
16532032.16  25  < 0.1%
1412484.09  22  < 0.1%
4743010.67  21  < 0.1%
1178808.14  21  < 0.1%
7364724.84  21  < 0.1%
Other values (3555489)  3922938  61.7%

Value  Count  Frequency (%)
0  2439433  38.3%
0.01  1  < 0.1%
0.33  1  < 0.1%
1.39  1  < 0.1%
1.64  1  < 0.1%
1.74  1  < 0.1%
2.15  1  < 0.1%
2.45  1  < 0.1%
2.71  1  < 0.1%
2.76  1  < 0.1%

Value  Count  Frequency (%)
356179278.9  1  < 0.1%
356015889.4  1  < 0.1%
355553416.3  2  < 0.1%
355381433.6  1  < 0.1%
355380483.5  1  < 0.1%
355185537.1  1  < 0.1%
328431698.2  1  < 0.1%
328194464.9  1  < 0.1%
327998074.2  1  < 0.1%
327963024  1  < 0.1%

is_fraud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

0  6354407
1  8213

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0  6354407  99.9%
1  8213  0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 6354407
99.9%
1 8213
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 6354407
99.9%
1 8213
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6362620
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6354407
99.9%
1 8213
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 6362620
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6354407
99.9%
1 8213
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6362620
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6354407
99.9%
1 8213
 
0.1%

is_flagged_fraud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

0  6362604
1  16

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  6362604  > 99.9%
1  16  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 6362604
> 99.9%
1 16
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 6362604
> 99.9%
1 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6362620
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6362604
> 99.9%
1 16
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 6362620
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6362604
> 99.9%
1 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6362620
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6362604
> 99.9%
1 16
 
< 0.1%

diff_org_balance
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 907958
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -201092.08
Minimum: -92445516
Maximum: 0
Zeros: 1361836
Zeros (%): 21.4%
Negative: 5000784
Negative (%): 78.6%
Memory size: 48.5 MiB

Quantile statistics

Minimum: -92445516
5-th percentile: -700716.05
Q1: -249640.25
Median: -68677
Q3: -2954
95-th percentile: 0
Maximum: 0
Range: 92445516
Interquartile range (IQR): 246686.25

Descriptive statistics

Standard deviation: 606650.43
Coefficient of variation (CV): -3.0167793
Kurtosis: 1753.2688
Mean: -201092.08
Median Absolute Deviation (MAD): 68677
Skewness: -30.074751
Sum: -1.2794725 × 10^12
Variance: 3.6802474 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  1361836  21.4%
-10000000  1085  < 0.1%
-1000  168  < 0.1%
-10000  133  < 0.1%
-20000  121  < 0.1%
-84  108  < 0.1%
-1944  108  < 0.1%
-2514  105  < 0.1%
-1982  105  < 0.1%
-510  104  < 0.1%
Other values (907948)  4998747  78.6%

Value  Count  Frequency (%)
-92445516  1  < 0.1%
-73823490  1  < 0.1%
-71172480  1  < 0.1%
-69886731  1  < 0.1%
-69337316  1  < 0.1%
-67500761  1  < 0.1%
-66761272  1  < 0.1%
-64234448  1  < 0.1%
-63847992  1  < 0.1%
-63294839  1  < 0.1%

Value  Count  Frequency (%)
0  1361836  21.4%
-1  65  < 0.1%
-2  70  < 0.1%
-3  59  < 0.1%
-4  59  < 0.1%
-5  71  < 0.1%
-6  83  < 0.1%
-7  85  < 0.1%
-8  79  < 0.1%
-9  71  < 0.1%

diff_dest_balance
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 1197049
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -304156.23
Minimum: -1.8489103 × 10^8
Maximum: 12930418
Zeros: 1032168
Zeros (%): 16.2%
Negative: 5273086
Negative (%): 82.9%
Memory size: 48.5 MiB

Quantile statistics

Minimum: -1.8489103 × 10^8
5-th percentile: -1056983.3
Q1: -308391
Median: -27748
Q3: -3815
95-th percentile: 0
Maximum: 12930418
Range: 1.9782145 × 10^8
Interquartile range (IQR): 304576

Descriptive statistics

Standard deviation: 1362380.9
Coefficient of variation (CV): -4.4792142
Kurtosis: 1554.52
Mean: -304156.23
Median Absolute Deviation (MAD): 27748
Skewness: -30.331093
Sum: -1.9352305 × 10^12
Variance: 1.8560817 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  1032168  16.2%
-2216  168  < 0.1%
-1181  168  < 0.1%
-1005  166  < 0.1%
-1339  166  < 0.1%
-1722  166  < 0.1%
-2752  166  < 0.1%
-497  165  < 0.1%
-1797  165  < 0.1%
-1259  165  < 0.1%
Other values (1197039)  5328957  83.8%

Value  Count  Frequency (%)
-184891033  1  < 0.1%
-164559608  1  < 0.1%
-147646980  1  < 0.1%
-142344960  1  < 0.1%
-139773462  1  < 0.1%
-138674632  1  < 0.1%
-135767972  1  < 0.1%
-135489952  1  < 0.1%
-135001522  1  < 0.1%
-133522544  1  < 0.1%

Value  Count  Frequency (%)
12930418  1  < 0.1%
9385209  1  < 0.1%
5672835  1  < 0.1%
5315828  1  < 0.1%
5263390  1  < 0.1%
5114161  1  < 0.1%
4834325  1  < 0.1%
4755039  1  < 0.1%
4565838  1  < 0.1%
4525018  1  < 0.1%

merchant_dest
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

0  4211125
1  2151495

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0  4211125  66.2%
1  2151495  33.8%
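
The report does not say how merchant_dest was built. Its counts (4211125 zeros, 2151495 ones) match the number of name_dest values starting with C and M respectively, and its sample rows line up with the name_dest sample rows. The sketch below therefore assumes that merchant_dest flags destination names beginning with "M"; this is an inference from the counts, not a documented rule.

```python
# Sample rows copied from the name_dest section of this report.
name_dest_sample = [
    "M1979787155",
    "M2044282225",
    "C553264065",
    "C38997010",
    "M1230701703",
]

# Assumed rule (not stated in the report): 1 if the destination ID
# starts with "M" (merchant), otherwise 0.
merchant_dest_sample = [int(name.startswith("M")) for name in name_dest_sample]

print(merchant_dest_sample)
```

If the assumption holds, the result equals the five merchant_dest sample rows shown above (1, 1, 0, 0, 1), and it would also explain the high correlation alert between type and merchant_dest, since the count of ones equals the PAYMENT count.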

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 4211125
66.2%
1 2151495
33.8%

Most occurring characters

ValueCountFrequency (%)
0 4211125
66.2%
1 2151495
33.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6362620
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 4211125
66.2%
1 2151495
33.8%

Most occurring scripts

ValueCountFrequency (%)
Common 6362620
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 4211125
66.2%
1 2151495
33.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6362620
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 4211125
66.2%
1 2151495
33.8%

day
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.491907
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 7
Median: 10
Q3: 14
95-th percentile: 21
Maximum: 31
Range: 30
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.9218122
Coefficient of variation (CV): 0.56441716
Kurtosis: 0.33230145
Mean: 10.491907
Median Absolute Deviation (MAD): 4
Skewness: 0.37784774
Sum: 66756016
Variance: 35.06786
Monotonicity: Increasing
Histogram with fixed size bins (bins=31)
Value  Count  Frequency (%)
1  574255  9.0%
2  455238  7.2%
8  449637  7.1%
6  441005  6.9%
13  428583  6.7%
17  425766  6.7%
7  420583  6.6%
9  417919  6.6%
11  417859  6.6%
15  401282  6.3%
Other values (21)  1930493  30.3%

Value  Count  Frequency (%)
1  574255  9.0%
2  455238  7.2%
3  1070  < 0.1%
4  28240  0.4%
5  9789  0.2%
6  441005  6.9%
7  420583  6.6%
8  449637  7.1%
9  417919  6.6%
10  392945  6.2%

Value  Count  Frequency (%)
31  272  < 0.1%
30  11287  0.2%
29  54890  0.9%
28  14661  0.2%
27  8578  0.1%
26  13885  0.2%
25  57853  0.9%
24  32709  0.5%
23  51012  0.8%
22  53437  0.8%

Interactions

[Figure: grid of pairwise interaction plots between variables]
2023-02-05T17:33:07.891942image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:33:32.593809image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:30:33.803861image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:30:58.408540image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:31:20.776641image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:31:43.491849image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:32:04.887171image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:32:26.113752image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:32:47.706555image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-02-05T17:33:10.303348image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Correlations

                     step  amount  oldbalance_org  newbalance_orig  oldbalance_dest  newbalance_dest  diff_org_balance  diff_dest_balance     day    type  is_fraud  is_flagged_fraud  merchant_dest
step                1.000   0.001          -0.006           -0.011           -0.005           -0.005            -0.002              0.019   0.998   0.011     0.059             0.006          0.010
amount              0.001   1.000           0.048           -0.071            0.595            0.670            -0.879             -0.576   0.007   0.050     0.049             0.014          0.022
oldbalance_org     -0.006   0.048           1.000            0.803            0.024           -0.008             0.042              0.391  -0.006   0.213     0.031             0.003          0.162
newbalance_orig    -0.011  -0.071           0.803            1.000            0.044           -0.094             0.031              0.621  -0.012   0.238     0.019             0.005          0.180
oldbalance_dest    -0.005   0.595           0.024            0.044            1.000            0.936            -0.590             -0.199  -0.003   0.017     0.002             0.000          0.022
newbalance_dest    -0.005   0.670          -0.008           -0.094            0.936            1.000            -0.610             -0.375  -0.003   0.027     0.002             0.000          0.025
diff_org_balance   -0.002  -0.879           0.042            0.031           -0.590           -0.610             1.000              0.328  -0.007   0.050     0.000             0.014          0.022
diff_dest_balance   0.019  -0.576           0.391            0.621           -0.199           -0.375             0.328              1.000   0.015   0.085     0.067             0.010          0.040
day                 0.998   0.007          -0.006           -0.012           -0.003           -0.003            -0.007              0.015   1.000   0.011     0.062             0.006          0.011
type                0.011   0.050           0.213            0.238            0.017            0.027             0.050              0.085   0.011   1.000     0.059             0.005          1.000
is_fraud            0.059   0.049           0.031            0.019            0.002            0.002             0.000              0.067   0.062   0.059     1.000             0.043          0.026
is_flagged_fraud    0.006   0.014           0.003            0.005            0.000            0.000             0.014              0.010   0.006   0.005     0.043             1.000          0.001
merchant_dest       0.010   0.022           0.162            0.180            0.022            0.025             0.022              0.040   0.011   1.000     0.026             0.001          1.000
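
The "High correlation" alerts in the overview can be read straight off this matrix. The following is a minimal sketch, not the report generator's own code: the helper name `high_correlation_pairs` is hypothetical, and the 0.5 cutoff is an assumption that happens to be consistent with the alerts listed above. The report does not state which coefficient was used for each pair of variables, so the values are simply taken as given.

```python
import pandas as pd


def high_correlation_pairs(corr: pd.DataFrame, threshold: float = 0.5):
    """Return (var_a, var_b, value) for each unordered pair with |value| >= threshold."""
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        # Only look at columns after `a`, so each pair is reported once
        # and the diagonal (always 1.000) is skipped.
        for b in cols[i + 1:]:
            value = float(corr.loc[a, b])
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# A 4x4 excerpt copied from the correlation matrix in this section.
names = ["step", "amount", "diff_org_balance", "day"]
excerpt = pd.DataFrame(
    [
        [1.000, 0.001, -0.002, 0.998],
        [0.001, 1.000, -0.879, 0.007],
        [-0.002, -0.879, 1.000, -0.007],
        [0.998, 0.007, -0.007, 1.000],
    ],
    index=names,
    columns=names,
)

pairs = high_correlation_pairs(excerpt)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:+.3f}")
```

On this excerpt the two surviving pairs are step/day and amount/diff_org_balance, which matches the corresponding alerts in the overview.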

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
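
The figures themselves are not reproduced here, but the per-column counts behind a nullity bar chart are easy to recompute. This is a minimal sketch using pandas only. The helper name `nullity_summary` and the toy frame are hypothetical. The toy frame deliberately contains gaps for illustration, whereas the profiled dataset reports 0 missing cells.

```python
import pandas as pd


def nullity_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count missing cells per column and express them as a percentage of rows."""
    missing = df.isna().sum()
    return pd.DataFrame(
        {
            "missing": missing,
            "missing_pct": 100.0 * missing / len(df),
        }
    )


# Hypothetical toy data with gaps; NOT taken from the profiled dataset.
toy = pd.DataFrame(
    {
        "step": [1, 1, 1, 1],
        "type": ["TRANSFER", "PAYMENT", None, "DEBIT"],
        "amount": [181.0, None, 7817.71, 5337.77],
    }
)

summary = nullity_summary(toy)
print(summary)
```

Applied to the profiled dataset, every row of such a summary would be expected to show zero, in line with the "Missing cells 0" figure in the overview.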

Sample

First rows

   step  type          amount  name_orig    oldbalance_org  newbalance_orig  name_dest    oldbalance_dest  newbalance_dest  is_fraud  is_flagged_fraud  diff_org_balance  diff_dest_balance  merchant_dest  day
0     1  PAYMENT    9839.6400  C1231006815     170136.0000      160296.3600  M1979787155           0.0000           0.0000         0                 0                 0              -9839              1    1
1     1  PAYMENT    1864.2800  C1666544295      21249.0000       19384.7200  M2044282225           0.0000           0.0000         0                 0                 0              -1864              1    1
2     1  TRANSFER    181.0000  C1305486145        181.0000           0.0000  C553264065            0.0000           0.0000         1                 0                 0               -181              0    1
3     1  CASH_OUT    181.0000  C840083671         181.0000           0.0000  C38997010         21182.0000           0.0000         1                 0                 0              21001              0    1
4     1  PAYMENT   11668.1400  C2048537720      41554.0000       29885.8600  M1230701703           0.0000           0.0000         0                 0                 0             -11668              1    1
5     1  PAYMENT    7817.7100  C90045638        53860.0000       46042.2900  M573487274            0.0000           0.0000         0                 0                 0              -7817              1    1
6     1  PAYMENT    7107.7700  C154988899      183195.0000      176087.2300  M408069119            0.0000           0.0000         0                 0                 0              -7107              1    1
7     1  PAYMENT    7861.6400  C1912850431     176087.2300      168225.5900  M633326333            0.0000           0.0000         0                 0                 0              -7861              1    1
8     1  PAYMENT    4024.3600  C1265012928       2671.0000           0.0000  M1176932104           0.0000           0.0000         0                 0             -1353              -4024              1    1
9     1  DEBIT      5337.7700  C712410124       41720.0000       36382.2300  C195600860        41898.0000       40348.7900         0                 0                 0              -3788              0    1

Last rows

         step  type            amount  name_orig    oldbalance_org  newbalance_orig  name_dest    oldbalance_dest  newbalance_dest  is_fraud  is_flagged_fraud  diff_org_balance  diff_dest_balance  merchant_dest  day
6362610   742  TRANSFER    63416.9900  C778071008       63416.9900           0.0000  C1812552860           0.0000           0.0000         1                 0                 0             -63416              0   31
6362611   742  CASH_OUT    63416.9900  C994950684       63416.9900           0.0000  C1662241365      276433.1800      339850.1700         1                 0                 0            -126833              0   31
6362612   743  TRANSFER  1258818.8200  C1531301470    1258818.8200           0.0000  C1470998563           0.0000           0.0000         1                 0                 0           -1258818              0   31
6362613   743  CASH_OUT  1258818.8200  C1436118706    1258818.8200           0.0000  C1240760502      503464.5000     1762283.3300         1                 0                 0           -2517637              0   31
6362614   743  TRANSFER   339682.1300  C2013999242     339682.1300           0.0000  C1850423904           0.0000           0.0000         1                 0                 0            -339682              0   31
6362615   743  CASH_OUT   339682.1300  C786484425      339682.1300           0.0000  C776919290            0.0000      339682.1300         1                 0                 0            -679364              0   31
6362616   743  TRANSFER  6311409.2800  C1529008245    6311409.2800           0.0000  C1881841831           0.0000           0.0000         1                 0                 0           -6311409              0   31
6362617   743  CASH_OUT  6311409.2800  C1162922333    6311409.2800           0.0000  C1365125890       68488.8400     6379898.1100         1                 0                 0          -12622818              0   31
6362618   743  TRANSFER   850002.5200  C1685995037     850002.5200           0.0000  C2080388513           0.0000           0.0000         1                 0                 0            -850002              0   31
6362619   743  CASH_OUT   850002.5200  C1280323807     850002.5200           0.0000  C873221189      6510099.1100     7360101.6300         1                 0                 0           -1700005              0   31
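
The report does not say how the derived columns (diff_org_balance, diff_dest_balance, merchant_dest, day) were built. The sketch below is a guess that reproduces the sample rows shown in this section: the formulas are inferred from those rows rather than taken from the original pipeline, and the helper name `add_derived_columns` is hypothetical. In particular, `day = ceil(step / 24)` fits steps 1, 742 and 743, but the sample cannot distinguish it from other roundings.

```python
import numpy as np
import pandas as pd


def add_derived_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Recompute the four derived columns from the raw transaction fields."""
    out = df.copy()
    # Assumed: sender-side mismatch, truncated toward zero.
    out["diff_org_balance"] = np.trunc(
        out["oldbalance_org"] - out["newbalance_orig"] - out["amount"]
    ).astype("int64")
    # Assumed: receiver-side mismatch with the same sign convention.
    out["diff_dest_balance"] = np.trunc(
        out["oldbalance_dest"] - out["newbalance_dest"] - out["amount"]
    ).astype("int64")
    # Assumed: destination IDs starting with "M" denote merchants.
    out["merchant_dest"] = out["name_dest"].str.startswith("M").astype("int64")
    # Assumed: each step is one hour, so 24 steps make one day.
    out["day"] = np.ceil(out["step"] / 24).astype("int64")
    return out


# Raw fields copied from sample rows 3, 8, 9 and 6362615.
raw = pd.DataFrame(
    {
        "step": [1, 1, 1, 743],
        "amount": [181.00, 4024.36, 5337.77, 339682.13],
        "oldbalance_org": [181.00, 2671.00, 41720.00, 339682.13],
        "newbalance_orig": [0.00, 0.00, 36382.23, 0.00],
        "name_dest": ["C38997010", "M1176932104", "C195600860", "C776919290"],
        "oldbalance_dest": [21182.00, 0.00, 41898.00, 0.00],
        "newbalance_dest": [0.00, 0.00, 40348.79, 339682.13],
    },
    index=[3, 8, 9, 6362615],
)

result = add_derived_columns(raw)
print(result[["diff_org_balance", "diff_dest_balance", "merchant_dest", "day"]])
```

If these formulas are close to what was actually used, they also explain the strong correlations of amount with both balance-difference columns: amount enters each difference directly.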